Update Cassandra example config#1120
Cassandra has added a lot more metrics recently. The existing example config results in a scrape time of 15-30 seconds and more than doubles Cassandra's CPU usage in our setup.

This changes the example config to use a whitelist approach with a sample whitelist which takes a much more reasonable 200 ms to scrape.

Signed-off-by: Russ Garrett <russ@garrett.co.uk>
@russss Thanks for the PR!!! Given the other Cassandra example is valid, I feel we should add this as another example. Thoughts?
Personally, I don't think there's any use for the previous config any more. It's definitely too slow to use with a 15 second Prometheus scrape interval, it causes Cassandra to use a lot of CPU, and it generates literally thousands of metrics, which will blow up the size of the TSDB. This will presumably only get worse as Cassandra adds even more metrics, so I think the blacklist approach is a bad idea in general: you risk Cassandra upgrades flooding your TSDB with new metrics (which is what happened to me). I think the only reasonable way to use jmx_exporter with Cassandra now is to cherry-pick the metrics you want, and I think this is a good starting point as it provides a number of example match patterns.
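To illustrate the cherry-picking approach described above, here is a minimal sketch of a whitelist-style jmx_exporter config. It is not the config from this PR: the chosen MBeans and the output metric name are assumptions picked for illustration. `includeObjectNames` is the current name for the option; older jmx_exporter releases call it `whitelistObjectNames`.

```yaml
# Hypothetical sketch of a whitelist-style config, not the file in this PR.
lowercaseOutputName: true
lowercaseOutputLabelNames: true

# Only the MBeans listed here are queried, so metrics added by a
# Cassandra upgrade are not scraped unless they are listed explicitly.
includeObjectNames:
  - "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency"
  - "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency"

rules:
  # Export only the request count of each latency timer; the scope
  # (Read or Write) becomes a label. The metric name is an assumption.
  - pattern: 'org.apache.cassandra.metrics<type=ClientRequest, scope=(\w+), name=Latency><>Count'
    name: cassandra_client_request_latency_count
    labels:
      scope: "$1"
    type: COUNTER
```

Restricting the queried object names is what keeps the scrape cheap, since the exporter never reads the attributes of MBeans outside the list; the `rules` section then only controls naming and labels for what is left.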
For reference, I counted the number of metrics on our Cassandra 5.0.2 cluster, which only has a handful of tables. By my count the unfiltered If you apply the filtering in the existing cassandra.yaml, that reduces it to 15,821 metrics (2,588 histograms). My proposed change generates 68 metrics of which about 10 are histograms.